The Effect of the KDE Plasma Baloo File Indexing and File Search Framework on System Performance

March 11, 2018, 9 p.m.

Baloo, the file indexing and file search backend of the KDE Plasma desktop environment, is an extremely fast and useful part of Plasma. Among other features, it enables krunner's near-instant population of search results. Unfortunately, depending on its configuration, it can cause intermittent unresponsiveness during its initial indexing. Having been introduced to the Phoronix Test Suite (PTS), I thought I would use it to measure the impact of Baloo on system performance. This article compares PTS results obtained with Baloo actively indexing files against results obtained with all Baloo processes stopped.

Introduction

According to the KDE Community Wiki page on Baloo:

Baloo is the file indexing and file search framework for KDE Plasma... Baloo focuses on providing a very small memory footprint along with extremely fast searching. Baloo is not an application, but a daemon to index files. Applications can use the Baloo framework to provide file search results.
This development focus on the speed of searches was fruitful. Baloo search results are displayed impressively quickly. For example, in krunner, one of the applications that use the framework, extended results, including items not in recent documents or recent applications, appear almost instantly.
Plasma's krunner Widget Listing Search Results
krunner is one of the programs that build on the Baloo framework.
Unfortunately, depending on the selected settings, Baloo processes can make the system unresponsive during the period of initial indexing. The following screenshot shows htop with a filter on baloo. At the moment captured, baloo_file_extractor, a subprocess of baloo_file, is using 73.1% of CPU time and 10.3% of the 16 GB of memory in my Acer V15 Nitro Black Edition.
htop Showing Baloo's Resource Usage and Ksysguard Showing Overall Processor Load
The setting that actually hurts performance is whether file contents are indexed in addition to file names. When content indexing is enabled, the main baloo_file process apparently spawns the baloo_file_extractor process, and that process consumes nearly all of the resources used by the two Baloo processes.
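For readers who want to check this on their own machine without a screenshot, here is a minimal Python sketch. It assumes a Linux system with the procps `ps` command and parses the output of `ps -eo pid,comm,%cpu,%mem`. The function names and all sample rows except the 73.1% / 10.3% figures are illustrative assumptions. Note that `ps` reports CPU usage averaged over the life of the process, so its numbers will not match htop's per-cycle readings exactly, and that the kernel truncates the command name to 15 characters, which is why the extractor appears as `baloo_file_extr`.

```python
import subprocess

def parse_ps(text):
    """Parse `ps -eo pid,comm,%cpu,%mem` output into a list of dicts."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) < 4:
            continue
        rows.append({
            "pid": int(parts[0]),
            "comm": " ".join(parts[1:-2]),  # command names may contain spaces
            "cpu": float(parts[-2]),
            "mem": float(parts[-1]),
        })
    return rows

def baloo_rows(rows):
    """Keep only processes whose command name starts with 'baloo'."""
    return [r for r in rows if r["comm"].startswith("baloo")]

def mem_gib(mem_percent, total_gib=16.0):
    """Convert a %MEM figure into GiB for a machine with total_gib of RAM."""
    return total_gib * mem_percent / 100.0

def live_ps():
    """Return current `ps` output (assumes procps ps is installed)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,comm,%cpu,%mem"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# Illustrative sample; only the extractor's 73.1 / 10.3 come from the article.
SAMPLE = """\
  PID COMMAND         %CPU %MEM
 1201 baloo_file       1.2  0.4
 1377 baloo_file_extr 73.1 10.3
 2040 plasmashell      2.5  1.9
"""
```

Calling `baloo_rows(parse_ps(live_ps()))` lists the Baloo processes currently running, and `mem_gib(10.3)` turns the 10.3% figure above into roughly 1.65 GB on a 16 GB machine.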
User Configuration of Baloo
Baloo can be enabled, and the locations to be indexed or ignored can be set, in Plasma Settings. These settings are recorded in the main configuration file, ~/.config/baloofilerc. File content indexing can be enabled in this configuration file.
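As a rough illustration of how that file could be inspected, the following Python sketch reads baloofilerc-style text and reports whether content indexing appears to be on. The key names used here (`Indexing-Enabled` under `[Basic Settings]` and `only basic indexing` under `[General]`) and their defaults are assumptions based on commonly seen baloofilerc files, not something this article documents, so verify them against your own file. Real KDE configuration files can also contain constructs that Python's `configparser` does not handle, so treat this as a sketch rather than a robust parser.

```python
import configparser

def content_indexing_enabled(rc_text):
    """Guess from baloofilerc-style text whether file contents are indexed.

    Assumed keys (check your own file):
      [Basic Settings] Indexing-Enabled     (assumed default: true)
      [General]        only basic indexing  (assumed default: false)
    """
    parser = configparser.ConfigParser(
        delimiters=("=",),   # KDE config uses '=' only
        interpolation=None,  # values may contain literal '%'
        strict=False,
    )
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(rc_text)

    enabled = parser.getboolean(
        "Basic Settings", "Indexing-Enabled", fallback=True)
    basic_only = parser.getboolean(
        "General", "only basic indexing", fallback=False)
    # Contents are indexed only if Baloo is on and not limited to file names.
    return enabled and not basic_only

# Hypothetical file contents, for illustration only.
NAMES_ONLY = """\
[Basic Settings]
Indexing-Enabled=true

[General]
only basic indexing=true
"""
```

With the sample above, `content_indexing_enabled(NAMES_ONLY)` returns `False`. To check a real file, pass in the text of `~/.config/baloofilerc`, for example via `pathlib.Path.home() / ".config" / "baloofilerc"`.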

When I first encountered this problem with Baloo, I thought it was specific to openSUSE Tumbleweed's implementation, but it occurs in other distributions' implementations of Plasma as well. The cause is not a defect: I have over 900,000 files in the locations to be indexed, and I chose to index file contents in addition to file names.

Phoronix Test Suite Results

To minimize the time the computer would be unusable to me while the tests were executing, I chose a small subset of just two of the tests available in the Phoronix Test Suite. The computer's specifications, as collected by the Phoronix Test Suite, are shown below.

The Specifications of the Computer and OS As Collected By Phoronix Test Suite
The tests I chose are Stress-NG and Flexible IO Tester.
The Phoronix Test Suite Results Overview
The PTS tests quantify the difference in performance when Baloo is active and when it is not.
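The article does not record the exact commands used to set up the two test conditions, so the following Python sketch is only one plausible way to script them. It assumes that `balooctl disable` and `balooctl enable` are used to switch the indexer off and on, and that the two tests correspond to the PTS profiles `pts/stress-ng` and `pts/fio`; both are assumptions, and `build_plan` and `run_plan` are hypothetical helper names. Enabling Baloo does not by itself guarantee that it is actively indexing, which is the condition measured here, so check `balooctl status` before starting a run.

```python
import subprocess

def build_plan(baloo_active, tests=("pts/stress-ng", "pts/fio")):
    """Return the commands for one test condition as a list of argv lists.

    Assumptions: balooctl enable/disable toggles the indexer, and the
    PTS profile names match the two tests named in the article.
    """
    return [
        ["balooctl", "enable" if baloo_active else "disable"],
        ["balooctl", "status"],  # confirm the indexer state before testing
        ["phoronix-test-suite", "benchmark", *tests],
    ]

def run_plan(plan):
    """Execute each command in order, stopping at the first failure."""
    for argv in plan:
        subprocess.run(argv, check=True)
```

`build_plan(False)` yields the Baloo-stopped condition and `build_plan(True)` the Baloo-active one. Nothing is executed until a plan is passed to `run_plan`, so the plan can be printed and reviewed first.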
The results of the individual tests are presented in the following set of slides.
The Results of the Individual Tests for Each Test Condition
In nearly all cases, as expected, the system performed better when Baloo processes were not running. Quantitatively, however, the performance difference was minimal.

Conclusion

Of the twenty-four test runs in this set, the system performed better in twenty-three when the Baloo processes were not running.

A Summary of Wins and Losses Based on Test Condition
This summary output was produced by phoronix-test-suite winners-and-losers. Runs with Baloo inactive outperformed runs with Baloo active in 95.8% of cases (23 of 24).
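The percentage follows directly from the win count. As a quick check, here is the arithmetic in Python; `win_rate` is just an illustrative helper, not part of the Phoronix Test Suite.

```python
def win_rate(wins, total):
    """Return the share of runs won, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * wins / total

# 23 of the 24 runs favoured the Baloo-inactive condition.
inactive_pct = round(win_rate(23, 24), 1)
active_pct = round(win_rate(1, 24), 1)
print(f"Baloo inactive: {inactive_pct}%  Baloo active: {active_pct}%")
```

This prints 95.8% for the Baloo-inactive condition and 4.2% for the Baloo-active one. Keep in mind that a win rate says nothing about the size of each win, which is why it can sit alongside a minimal quantitative difference.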